Finding the Median (Obliviously) with Bounded Space
We prove that any oblivious algorithm using space to find the median of a
list of integers from requires time . This bound also applies to the problem of determining whether the median
is odd or even. It is nearly optimal since Chan, following Munro and Raman, has
shown that there is a (randomized) selection algorithm using only
registers, each of which can store an input value or -bit counter,
that makes only passes over the input. The bound also implies
a size lower bound for read-once branching programs computing the low order bit
of the median and implies the analog of for length oblivious branching programs
GraCT: A Grammar based Compressed representation of Trajectories
We present a compressed data structure to store free trajectories of moving
objects (ships over the sea, for example) allowing spatio-temporal queries. Our
method, GraCT, uses a -tree to store the absolute positions of all objects
at regular time intervals (snapshots), whereas the positions between snapshots
are represented as logs of relative movements compressed with Re-Pair. Our
experimental evaluation shows important savings in space and time with respect
to a fair baseline.
Comment: This research has received funding from the European Union's Horizon
2020 research and innovation programme under the Marie Skłodowska-Curie
Actions H2020-MSCA-RISE-2015 BIRDS GA No. 69094
A Faster Implementation of Online Run-Length Burrows-Wheeler Transform
Run-length encoding Burrows-Wheeler Transformed strings, resulting in
Run-Length BWT (RLBWT), is a powerful tool for processing highly repetitive
strings. We propose a new algorithm for online RLBWT working in run-compressed
space, which runs in time and bits of space, where
is the length of input string received so far and is the number of runs
in the BWT of the reversed . We improve the state-of-the-art algorithm for
online RLBWT in terms of empirical construction time. Adopting the dynamic list
for maintaining a total order, we can replace rank queries in a dynamic wavelet
tree on a run-length compressed string by the direct comparison of labels in a
dynamic list. The empirical results on various benchmarks show the efficiency
of our algorithm, especially for highly repetitive strings.
Comment: In Proc. IWOCA201
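To make the run-length BWT idea concrete, here is a minimal Python sketch. It is not the online algorithm from the abstract: it builds the BWT naively from sorted rotations of the string itself (not its reverse) and then run-length encodes it. The function names `bwt` and `run_length_encode` and the `$` sentinel are assumptions chosen for illustration.

```python
def bwt(text):
    """Naive BWT via sorted rotations.

    Assumes `text` ends with a unique sentinel '$' that sorts before
    every other character. Quadratic space; for illustration only.
    """
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def run_length_encode(s):
    """Collapse maximal runs of equal characters into (char, length) pairs."""
    runs = []
    for ch in s:
        if runs and runs[-1][0] == ch:
            runs[-1] = (ch, runs[-1][1] + 1)
        else:
            runs.append((ch, 1))
    return runs


transformed = bwt("banana$")
runs = run_length_encode(transformed)
```

The more repetitive the input, the fewer runs the transformed string tends to have, which is why the number of runs is the natural space parameter for this kind of representation.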
Tree Compression with Top Trees Revisited
We revisit tree compression with top trees (Bille et al, ICALP'13) and
present several improvements to the compressor and its analysis. By
significantly reducing the amount of information stored and guiding the
compression step using a RePair-inspired heuristic, we obtain a fast compressor
achieving good compression ratios, addressing an open problem posed by Bille et
al. We show how, with relatively small overhead, the compressed file can be
converted into an in-memory representation that supports basic navigation
operations in worst-case logarithmic time without decompression. We also show a
much improved worst-case bound on the size of the output of top-tree
compression (answering an open question posed in a talk on this algorithm by
Weimann in 2012).
Comment: SEA 201
Low Space External Memory Construction of the Succinct Permuted Longest Common Prefix Array
The longest common prefix (LCP) array is a versatile auxiliary data structure
in indexed string matching. It can be used to speed up searching using the
suffix array (SA) and provides an implicit representation of the topology of an
underlying suffix tree. The LCP array of a string of length can be
represented as an array of length words, or, in the presence of the SA, as
a bit vector of bits plus asymptotically negligible support data
structures. External memory construction algorithms for the LCP array have been
proposed, but those proposed so far have a space requirement of words
(i.e. bits) in external memory. This space requirement is in some
practical cases prohibitively expensive. We present an external memory
algorithm for constructing the bit version of the LCP array which uses
bits of additional space in external memory when given a
(compressed) BWT with alphabet size and a sampled inverse suffix array
at sampling rate . This is often a significant space gain in
practice where is usually much smaller than or even constant. We
also consider the case of computing succinct LCP arrays for circular strings
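As a small illustration of the objects involved, the following Python sketch computes a suffix array, the LCP array, and the permuted LCP (PLCP) array in memory with naive methods. It is not the external memory algorithm described above; the function names are assumptions made for this example.

```python
def suffix_array(s):
    """Naive suffix array: start positions of suffixes in sorted order."""
    return sorted(range(len(s)), key=lambda i: s[i:])


def lcp_array(s, sa):
    """lcp[r] = longest common prefix of suffixes sa[r-1] and sa[r]; lcp[0] = 0."""
    def common_prefix(a, b):
        k = 0
        while a + k < len(s) and b + k < len(s) and s[a + k] == s[b + k]:
            k += 1
        return k

    return [0] + [common_prefix(sa[r - 1], sa[r]) for r in range(1, len(sa))]


def permuted_lcp(sa, lcp):
    """PLCP stores the same values indexed by text position instead of rank."""
    plcp = [0] * len(sa)
    for rank, pos in enumerate(sa):
        plcp[pos] = lcp[rank]
    return plcp


text = "banana"
sa = suffix_array(text)
lcp = lcp_array(text, sa)
plcp = permuted_lcp(sa, lcp)
```

In text order the values can drop by at most one from one position to the next (`plcp[i + 1] >= plcp[i] - 1`), and this regularity is what allows the permuted array to be stored as a compact bit vector.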
Efficient and Compact Representations of Some Non-canonical Prefix-Free Codes
The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-46049-9_5
[Abstract] For many kinds of prefix-free codes there are efficient and compact alternatives to the traditional tree-based representation. Since these put the codes into canonical form, however, they can only be used when we can choose the order in which codewords are assigned to characters. In this paper we first show how, given a probability distribution over an alphabet of σ characters, we can store a nearly optimal alphabetic prefix-free code in o(σ) bits such that we can encode and decode any character in constant time. We then consider a kind of code introduced recently to reduce the space usage of wavelet matrices (Claude, Navarro, and Ordóñez, Information Systems, 2015). They showed how to build an optimal prefix-free code such that the codewords' lengths are non-decreasing when they are arranged such that their reverses are in lexicographic order. We show how to store such a code in O(σ log L + 2^{εL}) bits, where L is the maximum codeword length and ε is any positive constant, such that we can encode and decode any character in constant time under reasonable assumptions. Otherwise, we can always encode and decode a codeword of ℓ bits in time O(ℓ) using O(σ log L) bits of space.
Ministerio de Economía, Industria y Competitividad; TIN2013-47090-C3-3-P
Ministerio de Economía, Industria y Competitividad; TIN2015-69951-R
Ministerio de Economía, Industria y Competitividad; ITC-20151305
Ministerio de Economía, Industria y Competitividad; ITC-20151247
Xunta de Galicia; GRC2013/053
Chile. Núcleo Milenio Información y Coordinación en Redes; ICM/FIC.P10-024F
COST. IC1302
Academy of Finland; 268324
Academy of Finland; 25034
Kinematics and helicity evolution of a loop-like eruptive prominence
We aim to investigate the morphological, kinematic, and helicity evolution of a
loop-like prominence during its eruption. We use multi-instrument observations
from AIA/SDO, EUVI/STEREO and LASCO/SoHO. The kinematic, morphological,
geometrical, and helicity evolution of a loop-like eruptive prominence are
studied in the context of the magnetic flux rope model of solar prominences.
The prominence eruption evolved as a height expanding twisted loop with both
legs anchored in the chromosphere of a plage area. The eruption process
consists of a prominence activation, acceleration, and a phase of constant
velocity. The prominence body was composed of left-hand (counter-clockwise)
twisted threads around the main prominence axis. The twist during the eruption
was estimated at 6pi (3 turns). The prominence reached a maximum height of 526
Mm before contracting to its primary location and partially reformed in the
same place two days after the eruption. This ejection, however, triggered a CME
seen in LASCO C2. The prominence was located in the northern periphery of the
CME magnetic field configuration and, therefore, the background magnetic field
was asymmetric with respect to the filament position. The physical conditions
of the falling plasma blobs were analysed with respect to the prominence
kinematics. The same sign of the prominence body twist and writhe, as well as
the amount of twisting above the critical value of 2pi after the activation
phase indicate that conditions for kink instability were possibly present. No
signature of magnetic reconnection was observed anywhere in the prominence body
and its surroundings. The filament/prominence descent following the eruption
and its partial reformation at the same place two days later suggest a confined
type of eruption. The asymmetric background magnetic field possibly played an
important role in the failed eruption.
Comment: 9 pages, 8 figures, in press in A&
Encodings of Range Maximum-Sum Segment Queries and Applications
Given an array A containing arbitrary (positive and negative) numbers, we
consider the problem of supporting range maximum-sum segment queries on A:
i.e., given an arbitrary range [i,j], return the subrange [i',j'] ⊆
[i,j] such that the sum of the numbers in A[i'..j'] is maximized. Chen and Chao
[Disc. App. Math. 2007] presented a data structure for this problem that
occupies Θ(n) words, can be constructed in Θ(n) time, and
supports queries in Θ(1) time. Our first result is that if only the
indices [i',j'] are desired (rather than the maximum sum achieved in that
subrange), then it is possible to reduce the space to Θ(n) bits,
regardless of the numbers stored in A, while retaining the same construction and
query time. We also improve the best known space lower bound for any data
structure that supports range maximum-sum segment queries from n bits to
1.89113n - Θ(lg n) bits, for sufficiently large values of n. Finally, we
provide a new application of this data structure which simplifies a previously
known linear time algorithm for finding k-covers: i.e., given an array A of n
numbers and a number k, find k disjoint subranges [i_1 ,j_1 ],...,[i_k ,j_k ],
such that the total sum of all the numbers in the subranges is maximized.
Comment: 19 pages + 2 page appendix, 4 figures. A shortened version of this
paper will appear in CPM 201
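The query itself can be illustrated with a short Python sketch. This is not the constant-time data structure from the abstract: it answers each query by scanning A[i..j] with Kadane's algorithm, taking time linear in the range length. The function name `max_sum_segment` and the tie-breaking rule (the first maximal segment found wins) are assumptions for this example.

```python
def max_sum_segment(A, i, j):
    """Return (i2, j2) with i <= i2 <= j2 <= j maximizing sum(A[i2..j2]).

    Indices are 0-based and inclusive. Linear scan (Kadane's algorithm);
    the segment must be non-empty.
    """
    best_sum = A[i]
    best = (i, i)
    cur_sum = 0
    cur_start = i
    for k in range(i, j + 1):
        if cur_sum <= 0:
            # A non-positive running sum never helps: restart at k.
            cur_sum = A[k]
            cur_start = k
        else:
            cur_sum += A[k]
        if cur_sum > best_sum:
            best_sum = cur_sum
            best = (cur_start, k)
    return best


A = [2, -3, 4, -1, 2, -5, 3]
whole = max_sum_segment(A, 0, 6)
prefix = max_sum_segment(A, 0, 1)
suffix = max_sum_segment(A, 5, 6)
```

Note that the function returns only the indices, not the sum, which matches the setting in which the abstract reports the space reduction to bits.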
Succinct Data Structures for Families of Interval Graphs
We consider the problem of designing succinct data structures for interval
graphs with vertices while supporting degree, adjacency, neighborhood and
shortest path queries in optimal time in the -bit word RAM
model. The degree query reports the number of incident edges to a given vertex
in constant time, the adjacency query returns true if there is an edge between
two vertices in constant time, the neighborhood query reports the set of all
adjacent vertices in time proportional to the degree of the queried vertex, and
the shortest path query returns a shortest path in time proportional to its
length, thus the running times of these queries are optimal. Towards showing
succinctness, we first show that at least bits
are necessary to represent any unlabeled interval graph with vertices,
answering an open problem of Yang and Pippenger [Proc. Amer. Math. Soc. 2017].
This is augmented by a data structure of size bits while
supporting not only the aforementioned queries optimally but also capable of
executing various combinatorial algorithms (like proper coloring, maximum
independent set etc.) on the input interval graph efficiently. Finally, we
extend our ideas to other variants of interval graphs, for example, proper/unit
interval graphs, k-proper and k-improper interval graphs, and circular-arc
graphs, and design succinct/compact data structures for these graph classes as
well along with supporting queries on them efficiently
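For readers unfamiliar with the queries, the Python sketch below shows adjacency and degree on an explicit (non-succinct) interval representation. It is only an illustration of what the queries compute, not the data structure from the abstract; the class name `IntervalGraph`, the closed-interval convention, and the use of sorted endpoint lists are assumptions for this example.

```python
from bisect import bisect_left, bisect_right


class IntervalGraph:
    """Interval graph on closed intervals (left, right); vertex v is intervals[v]."""

    def __init__(self, intervals):
        self.intervals = list(intervals)
        self.lefts = sorted(left for left, _ in self.intervals)
        self.rights = sorted(right for _, right in self.intervals)

    def adjacent(self, u, v):
        """Two distinct vertices are adjacent iff their intervals overlap."""
        (lu, ru), (lv, rv) = self.intervals[u], self.intervals[v]
        return u != v and lu <= rv and lv <= ru

    def degree(self, v):
        """Count overlapping intervals via two binary searches (O(log n) here)."""
        left, right = self.intervals[v]
        ends_before = bisect_left(self.rights, left)  # right endpoint < left
        starts_after = len(self.lefts) - bisect_right(self.lefts, right)  # left endpoint > right
        return len(self.intervals) - 1 - ends_before - starts_after


g = IntervalGraph([(1, 4), (2, 6), (5, 8), (9, 10)])
degrees = [g.degree(v) for v in range(4)]
```

The degree computation relies on the fact that an interval fails to overlap the queried one exactly when it ends before the query starts or starts after the query ends, and these two cases are disjoint.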
The Energy Landscape, Folding Pathways and the Kinetics of a Knotted Protein
The folding pathway and rate coefficients of the folding of a knotted protein
are calculated for a potential energy function with minimal energetic
frustration. A kinetic transition network is constructed using the discrete
path sampling approach, and the resulting potential energy surface is
visualized by constructing disconnectivity graphs. Owing to topological
constraints, the low-lying portion of the landscape consists of three distinct
regions, corresponding to the native knotted state and to configurations where
either the N- or C-terminus is not yet folded into the knot. The fastest
folding pathways from denatured states exhibit early formation of the
N-terminus portion of the knot and a rate-determining step where the C-terminus
is incorporated. The low-lying minima with the N-terminus knotted and the
C-terminus free therefore constitute an off-pathway intermediate for this
model. The insertion of both the N- and C-termini into the knot occurs late in
the folding process, creating large energy barriers that are the rate-limiting
steps in the folding process. When compared to other proteins
of a similar length, this system folds over six orders of magnitude more
slowly.
Comment: 19 page